We study only algorithms for real matrices, which are most commonly used in the applications described in this course.
For more details, see A. Kaylor Cline and I. Dhillon, "Computation of the Singular Value Decomposition", and the references therein.
The reader should be familiar with basic facts about the singular value decomposition, its perturbation theory, and algorithms for the symmetric eigenvalue decomposition.
The reader should be able to apply an adequate algorithm to a given problem, and to assess the accuracy of the solution.
The singular value decomposition (SVD) of $A\in\mathbb{R}^{m\times n}$ is $A=U\Sigma V^T$, where $U\in\mathbb{R}^{m\times m}$ is orthogonal, $U^TU=UU^T=I_m$, $V\in\mathbb{R}^{n\times n}$ is orthogonal, $V^TV=VV^T=I_n$, and $\Sigma \in\mathbb{R}^{m\times n}$ is diagonal with singular values $\sigma_1,\ldots,\sigma_{\min\{m,n\}}$ on the diagonal.
If $m>n$, the thin SVD of $A$ is $A=U_{1:m,1:n} \Sigma_{1:n,1:n} V^T$.
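In Julia, both decompositions are computed by the svd() function, whose full keyword argument switches between the thin (default) and the full SVD. Here is a small check of the dimensions on a random example (the matrix A₀ below is just an illustration):
In [ ]:
using LinearAlgebra
A₀ = rand(8, 5)
F = svd(A₀, full=true)    # full SVD: F.U is 8×8
F₁ = svd(A₀)              # thin SVD: F₁.U is 8×5
# Dimensions of U and the residual of the thin reconstruction
size(F.U), size(F₁.U), norm(F₁.U*Diagonal(F₁.S)*F₁.Vt - A₀)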
Algorithms for computing the SVD of $A$ are modifications of algorithms for the symmetric eigenvalue decomposition of the matrices $AA^T$, $A^TA$, and $\begin{bmatrix} 0 & A\\ A^T & 0 \end{bmatrix}$.
The most commonly used approach is the following three-step algorithm:
1. Reduce $A$ to an upper-bidiagonal matrix $B$ by orthogonal transformations, $X^TAY=B$.
2. Compute the SVD of $B$, $B=W\Sigma Z^T$.
3. Multiply $U=XW$ and $V=YZ$ to obtain the SVD of $A$, $A=U\Sigma V^T$.
If $m\geq n$, the overall operation count for this algorithm is $O(mn^2)$ operations.
Error bounds. Let $U\Sigma V^T$ and $\tilde U \tilde \Sigma \tilde V^T$ be the exact and the computed SVDs of $A$, respectively. The algorithms generally compute the SVD with errors bounded by $$ |\sigma_i-\tilde \sigma_i|\leq \phi \epsilon\|A\|_2, \qquad \|u_i-\tilde u_i\|_2, \| v_i-\tilde v_i\|_2 \leq \psi\epsilon \frac{\|A\|_2} {\min_{j\neq i} |\sigma_i-\tilde \sigma_j|}, $$ where $\epsilon$ is machine precision and $\phi$ and $\psi$ are slowly growing polynomial functions of $n$ which depend upon the algorithm used (typically $O(n)$ or $O(n^2)$). These bounds are obtained by combining perturbation bounds with the floating-point error analysis of the algorithms.
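As a quick illustration of the first bound, we can compare the singular values returned by two different LAPACK drivers (the QR-based DGESVD and the divide-and-conquer DGESDD) on a random matrix; since both satisfy the bound, their difference should be a modest multiple of $\epsilon\|A\|_2$. The matrix A₀ and the scaling below are our own sanity check, not part of the theory:
In [ ]:
using LinearAlgebra
A₀ = rand(200, 100)
σ_vd = LAPACK.gesvd!('N', 'N', copy(A₀))[2]   # QR-based driver, singular values only
σ_dc = LAPACK.gesdd!('N', copy(A₀))[2]        # divide-and-conquer driver
# Maximum difference measured in units of ε‖A₀‖₂
maximum(abs, σ_vd - σ_dc) / (eps() * opnorm(A₀))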
The reduction of $A$ to a bidiagonal matrix can be performed by applying $\min\{m-1,n\}$ Householder reflections $H_L$ from the left and $n-2$ Householder reflections $H_R$ from the right. In the first step, $H_L$ is chosen to annihilate all elements of the first column below the diagonal, and $H_R$ is chosen to annihilate all elements of the first row to the right of the first super-diagonal. Applying this procedure recursively yields the bidiagonal matrix.
$H_L$ and $H_R$ do not depend on the normalization of the respective Householder vectors $v_L$ and $v_R$. With the normalization $[v_L]_1=[v_R]_1=1$, the vectors $v_L$ are stored in the lower-triangular part of $A$, and the vectors $v_R$ are stored in the upper-triangular part of $A$ above the super-diagonal.
The matrices $H_L$ and $H_R$ are not formed explicitly: given $v_L$ and $v_R$, $A$ is overwritten with $H_L A H_R$ in $O(mn)$ operations using matrix-vector multiplications and rank-one updates.
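For illustration, here is a minimal sketch of such an implicit application of a left reflection: the helper myApplyLeft! below is our own, not a library routine, and it performs one matrix-vector product and one rank-one update instead of forming $H_L$:
In [ ]:
using LinearAlgebra
# Apply H = I - (2/(v⋅v))·v·v' to A from the left without forming H
function myApplyLeft!(A::AbstractMatrix, v::AbstractVector)
    w = (2/(v⋅v)) * (A'*v)   # matrix-vector product
    A .-= v*w'               # rank-one update
    A
end
# Check against explicit multiplication on a small random example
A₂ = rand(6,4); v₂ = rand(6)
H₂ = I - (2/(v₂⋅v₂))*v₂*v₂'
norm(myApplyLeft!(copy(A₂), v₂) - H₂*A₂)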
Instead of performing rank-one updates, $p$ transformations can be accumulated and then applied at once. This block algorithm is rich in matrix-matrix multiplications (roughly one half of the operations are performed using BLAS 3 routines), but it requires extra workspace.
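The following sketch illustrates the accumulation idea: $p$ reflectors $H_k=I-\tau_k v_kv_k^T$ are gathered into the compact WY form $H_1\cdots H_p=I-VTV^T$, which can then be applied to a block of columns with a few matrix-matrix multiplications. The function myCompactWY and the small check below are our own, not LAPACK's blocking code:
In [ ]:
using LinearAlgebra
# Triangular factor T of the compact WY representation H₁H₂⋯Hₚ = I - V*T*V',
# where Hₖ = I - τₖ·vₖ·vₖ' and vₖ is the k-th column of V
function myCompactWY(V::Matrix, τ::Vector)
    p = length(τ)
    T = zeros(eltype(V), p, p)
    for k = 1:p
        T[k,k] = τ[k]
        if k > 1
            T[1:k-1,k] = -τ[k] * T[1:k-1,1:k-1] * (V[:,1:k-1]'*V[:,k])
        end
    end
    T
end
# p random reflectors stored as in LAPACK: [vₖ]ₖ = 1, zeros above
m₀, p = 8, 3
V₀ = zeros(m₀,p); τ₀ = zeros(p)
for k = 1:p
    V₀[k,k] = 1.0
    V₀[k+1:m₀,k] = rand(m₀-k)
    τ₀[k] = 2/(V₀[:,k]⋅V₀[:,k])
end
T₀ = myCompactWY(V₀,τ₀)
Q₀ = prod([I - τ₀[k]*V₀[:,k]*V₀[:,k]' for k = 1:p])  # explicit product of the reflectors
norm(Q₀ - (I - V₀*T₀*V₀'))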
If the matrices $X$ and $Y$ are needed explicitly, they can be computed from the stored Householder vectors. In order to minimize the operation count, the computation starts from the smallest matrix and the size is gradually increased.
The backward error bounds for the bidiagonalization are as follows: The computed matrix $\tilde B$ is equal to the matrix which would be obtained by exact bidiagonalization of some perturbed matrix $A+\Delta A$, where $\|\Delta A\|_2 \leq \psi \varepsilon \|A\|_2$ and $\psi$ is a slowly increasing function of $n$. The computed matrices $\tilde X$ and $\tilde Y$ satisfy $\tilde X=X+\Delta X$ and $\tilde Y=Y+\Delta Y$, where $\|\Delta X \|_2,\|\Delta Y\|_2\leq \phi \varepsilon$ and $\phi$ is a slowly increasing function of $n$.
The bidiagonal reduction is implemented in the LAPACK subroutine DGEBRD. The computation of $X$ and $Y$ is implemented in the subroutine DORGBR, which is not yet wrapped in Julia.
Bidiagonalization can also be performed using Givens rotations. Givens rotations act more selectively than Householder reflectors and are useful if $A$ has some special structure, for example, if $A$ is a banded matrix. Error bounds for the function myBidiagG() defined below are the same as above, but with slightly different functions $\psi$ and $\phi$.
In [36]:
# norm(), ⋅ and I used below come from LinearAlgebra
using LinearAlgebra
m=8
n=5
import Random
Random.seed!(421)
# Random 8×5 matrix with integer-valued entries between -9 and 9
A=rand(-9.0:9,m,n)
Out[36]:
In [39]:
# Householder reflector H such that H*x is a multiple of e₁
function H(x)
    v=copy(x)
    v[1]+=sign(x[1])*norm(x)
    # Show the Householder vector normalized so that its first component is one
    display(v/v[1])
    I-(2/(v⋅v))*v*v'
end
Out[39]:
In [40]:
H1=H(A[:,1])
Out[40]:
In [16]:
H1'*H1
Out[16]:
In [19]:
B=H1*A
Out[19]:
In [35]:
H2=H(B[1,2:5])
Out[35]:
In [23]:
H3=[1 zeros(1,4);zeros(4) H2]
Out[23]:
In [24]:
B*H3
Out[24]:
In [25]:
H1*A*H3
Out[25]:
In [2]:
using LinearAlgebra
In [41]:
?LAPACK.gebrd!
Out[41]:
In [42]:
# gebrd! overwrites its argument, so we pass a copy of A
Outg=LAPACK.gebrd!(copy(A))
Out[42]:
In [45]:
# Scalar factors τ of the left Householder reflectors
Outg[4]
Out[45]:
In [28]:
# Assemble the bidiagonal matrix B from the computed diagonal and super-diagonal
B=Bidiagonal(Outg[2],Outg[3][1:end-1],'U')
Out[28]:
In [32]:
# A and B have the same singular values
[svdvals(A) svdvals(B)]
Out[32]:
In [46]:
# Extract X from the Householder vectors stored in the lower triangle of gebrd!'s output
function myBidiagX(H::Matrix)
    m,n=size(H)
    T=typeof(H[1,1])
    X = Matrix{T}(I,m,n)
    v = Array{T}(undef,m)
    # Accumulate the left reflectors backwards, starting from the smallest block
    for j = n : -1 : 1
        v[j] = one(T)
        v[j+1:m] = H[j+1:m, j]
        γ = -2 / (v[j:m]⋅v[j:m])
        w = γ * X[j:m, j:n]'*v[j:m]
        X[j:m, j:n] = X[j:m, j:n] + v[j:m]*w'
    end
    X
end
# Extract Y from the Householder vectors stored above the super-diagonal
# (the adjoint of gebrd!'s output is passed, so the vectors appear in columns)
function myBidiagY(H::AbstractMatrix)
    n,m=size(H)
    T=typeof(H[1,1])
    Y = Matrix{T}(I,n,n)
    v = Array{T}(undef,n)
    # Accumulate the right reflectors backwards, starting from the smallest block
    for j = n-2 : -1 : 1
        v[j+1] = one(T)
        v[j+2:n] = H[j+2:n, j]
        γ = -2 / (v[j+1:n]⋅v[j+1:n])
        w = γ * Y[j+1:n, j+1:n]'*v[j+1:n]
        Y[j+1:n, j+1:n] = Y[j+1:n, j+1:n] + v[j+1:n]*w'
    end
    Y
end
Out[46]:
In [47]:
X=myBidiagX(Outg[1])
Out[47]:
In [48]:
Y=myBidiagY(Outg[1]')
Out[48]:
In [49]:
# Orthogonality
norm(X'*X-I), norm(Y'*Y-I)
Out[49]:
In [50]:
# Residual error
norm(A*Y-X*B)
Out[50]:
In [53]:
# Bidiagonalization using Givens rotations
function myBidiagG(A1::Matrix)
    A=deepcopy(A1)
    m,n=size(A)
    T=typeof(A[1,1])
    X=Matrix{T}(I,m,m)
    Y=Matrix{T}(I,n,n)
    for j = 1 : n
        # Annihilate the elements below the diagonal in column j
        for i = j+1 : m
            G,r=givens(A,j,i,j)
            # Use the faster in-place variant
            # A=G*A
            lmul!(G,A)
            # X=G*X
            lmul!(G,X)
            # display(A)
        end
        # Annihilate the elements to the right of the super-diagonal in row j
        for i=j+2:n
            G,r=givens(A',j+1,i,j)
            # A*=adjoint(G)
            rmul!(A,adjoint(G))
            # Y*=adjoint(G)
            rmul!(Y,adjoint(G))
            # display(A)
        end
    end
    X',Bidiagonal(diag(A),diag(A,1),'U'), Y
end
Out[53]:
In [54]:
X₁, B₁, Y₁ = myBidiagG(A)
Out[54]:
In [55]:
# Orthogonality
norm(X₁'*X₁-I), norm(Y₁'*Y₁-I)
Out[55]:
In [56]:
# Check: X₁'*A*Y₁ recovers the bidiagonal matrix
X₁'*A*Y₁
Out[56]:
In [57]:
# X, Y and B are not unique
B
Out[57]:
In [58]:
B₁
Out[58]:
In [61]:
# T = B'*B is symmetric tridiagonal
Matrix(B)'*Matrix(B)
Out[61]:
Let $B$ be a real upper-bidiagonal matrix of order $n$ and let $B=W\Sigma Z^T$ be its SVD.
All methods for computing the SVD of a bidiagonal matrix are derived from the methods for computing the EVD of the tridiagonal matrix $T=B^T B$.
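As a quick sanity check with the matrix $B$ computed above, the eigenvalues of $T=B^TB$ are the squares of the singular values of $B$:
In [ ]:
# Eigenvalues of T = B'*B (ascending) versus squared singular values of B (descending)
reverse(eigvals(Symmetric(Matrix(B)'*Matrix(B)))) ≈ svdvals(B).^2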
The shift $\mu$ is the eigenvalue of the $2\times 2$ matrix $T_{n-1:n,n-1:n}$ which is closer to $T_{n,n}$. The first Givens rotation from the right is the one which annihilates the element $(1,2)$ of the shifted $2\times 2$ matrix $T_{1:2,1:2}-\mu I$. Applying this rotation to $B$ creates a bulge at the element $B_{2,1}$. This bulge is subsequently chased out by applying suitable Givens rotations alternating from the left and from the right. This is the Golub-Kahan algorithm.
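Below is a minimal sketch of one such sweep, written with dense matrices and the givens() function for readability rather than speed; the function myGolubKahanStep is our own, while the library routines work directly on the two diagonals of $B$:
In [ ]:
using LinearAlgebra
function myGolubKahanStep(B::Bidiagonal)
    n = size(B,1)
    d, e = B.dv, B.ev
    # Trailing 2×2 block of T = B'*B and its eigenvalue closer to T[n,n] (the shift)
    t₁₁ = d[n-1]^2 + (n > 2 ? e[n-2]^2 : zero(d[1]))
    t₁₂ = d[n-1]*e[n-1]
    t₂₂ = d[n]^2 + e[n-1]^2
    δ = (t₁₁-t₂₂)/2
    μ = t₂₂ - t₁₂^2/(δ + copysign(hypot(δ,t₁₂),δ))
    A = Matrix(B)
    # First rotation from the right annihilates the (1,2) element of T₁:₂,₁:₂ - μI
    G, r = givens(d[1]^2-μ, d[1]*e[1], 1, 2)
    rmul!(A, adjoint(G))                   # creates the bulge at A[2,1]
    # Chase the bulge with Givens rotations alternating from the left and the right
    for k = 1:n-1
        G, r = givens(A, k, k+1, k)        # from the left: annihilate A[k+1,k]
        lmul!(G, A)
        if k < n-1
            G, r = givens(A', k+1, k+2, k) # from the right: annihilate A[k,k+2]
            rmul!(A, adjoint(G))
        end
    end
    Bidiagonal(diag(A), diag(A,1), 'U')
end
B₄ = myGolubKahanStep(B)
# One sweep is an orthogonal equivalence, so the singular values are preserved
svdvals(B₄) - svdvals(B)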
The computed SVD satisfies the error bounds given above.
A special variant of the zero-shift QR algorithm (the Demmel-Kahan algorithm) computes the singular values of a bidiagonal matrix with high relative accuracy.
The tridiagonal divide-and-conquer method, bisection and inverse iteration, and MRRR method can also be adapted for bidiagonal matrices.
The zero-shift QR algorithm for bidiagonal matrices is implemented in the LAPACK routine DBDSQR; it is also used by the function svdvals(). The divide-and-conquer algorithm for bidiagonal matrices is implemented in the LAPACK routine DBDSDC; however, this algorithm also calls the zero-shift QR algorithm to compute the singular values.
In [62]:
W,σ,Z=svd(B)
Out[62]:
In [65]:
@which svd(B)
Out[65]:
In [66]:
σ₁=svdvals(B)
Out[66]:
In [67]:
@which svdvals(B)
Out[67]:
In [68]:
σ-σ₁
Out[68]:
In [69]:
?LAPACK.bdsqr!
Out[69]:
In [70]:
B.dv
Out[70]:
In [71]:
one(B)
Out[71]:
In [77]:
BV=Matrix(one(B))
BU=Matrix(one(B))
BC=Matrix(one(B))
# bdsqr! returns the singular values together with Vᵀ, U and C
σ₂,Z₂,W₂,C = LAPACK.bdsqr!('U',copy(B.dv),copy(B.ev),BV,BU,BC)
Out[77]:
In [78]:
W₂'*B*Z₂'
Out[78]:
In [79]:
?LAPACK.bdsdc!
Out[79]:
In [80]:
# bdsdc! returns d, e, U, Vᵀ, followed by further outputs we do not use here
σ₃,ee,W₃,Z₃,rest=LAPACK.bdsdc!('U','I',copy(B.dv),copy(B.ev))
Out[80]:
In [81]:
W₃'*B*Z₃'
Out[81]:
The functions svd(), LAPACK.bdsqr!() and LAPACK.bdsdc!() use the same algorithm to compute the singular values.
In [82]:
[σ₃-σ₂ σ₃-σ]
Out[82]:
Let us compute some timings. We observe $O(n^2)$ operations.
In [84]:
n=1000
Abig=Bidiagonal(rand(n),rand(n-1),'U')
Bbig=Bidiagonal(rand(2*n),rand(2*n-1),'U')
@time svdvals(Abig);
@time svdvals(Bbig);
@time LAPACK.bdsdc!('U','N',copy(Abig.dv),copy(Abig.ev));
@time svd(Abig);
@time svd(Bbig);
The final algorithm is obtained by combining the bidiagonalization with one of the bidiagonal SVD methods. The standard method is implemented in the LAPACK routine DGESVD, and the divide-and-conquer method in the LAPACK routine DGESDD.
The functions svd(), svdvals(), and svdvecs() use DGESDD. The wrappers for DGESVD and DGESDD give more control over the computation of the singular vectors.
In [85]:
# The built-in algorithm
U,σA,V=svd(A)
Out[85]:
In [88]:
Sv=svd(A)
Out[88]:
In [92]:
Sv.Vt
Out[92]:
In [93]:
# With our building blocks
U₁=X*W
V₁=Y*Z
U₁'*A*V₁
Out[93]:
In [94]:
?LAPACK.gesvd!
Out[94]:
In [95]:
# DGESVD
LAPACK.gesvd!('A','A',copy(A))
Out[95]:
In [96]:
?LAPACK.gesdd!
Out[96]:
In [97]:
LAPACK.gesdd!('N',copy(A))
Out[97]:
Let us perform some timings. We observe $O(n^3)$ operations.
In [98]:
n=1000
Abig=rand(n,n)
Bbig=rand(2*n,2*n)
@time Ubig,σbig,Vbig=svd(Abig);
@time svd(Bbig);
@time LAPACK.gesvd!('A','A',copy(Abig));
@time LAPACK.gesdd!('A',copy(Abig));
@time LAPACK.gesdd!('A',copy(Bbig));
In [99]:
# Residual
norm(Abig*Vbig-Ubig*Diagonal(σbig))
Out[99]: